24 research outputs found

On some orthogonalization schemes in Tensor Train format

Authors: Coulaud Olivier, Giraud Luc, Iannacito Martina
Publication date: 16/01/2024

In the framework of tensor spaces, we consider orthogonalization kernels that generate an orthogonal basis of a tensor subspace from a set of linearly independent tensors. In particular, we experimentally study the loss of orthogonality of six orthogonalization methods: Classical and Modified Gram-Schmidt with (CGS2, MGS2) and without (CGS, MGS) re-orthogonalization, the Gram approach, and the Householder transformation. To overcome the curse of dimensionality, we represent tensors with a low-rank approximation using the Tensor Train (TT) formalism. In addition, we introduce recompression steps into the standard algorithm outline through the TT-rounding method at a prescribed accuracy. After describing the structure and properties of the algorithms, we illustrate their loss of orthogonality with numerical experiments. The theoretical bounds from the classical matrix computation round-off analysis, obtained over several decades, seem to be maintained, with the unit round-off replaced by the TT-rounding accuracy. The study is completed by an analysis of each orthogonalization kernel in terms of memory requirements and computational complexity, the latter measured as a function of the number of TT-rounding operations, which are the most computationally expensive step.
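The loss of orthogonality discussed in this abstract can be illustrated in the plain matrix setting. The following is a minimal NumPy sketch, not the authors' TT implementation: it works on dense vectors in double precision, without TT-rounding, and the test matrix, its size, and its condition number are assumptions chosen for illustration. It measures the loss of orthogonality as the spectral norm of I - Q^T Q for CGS, CGS2, and MGS.

```python
import numpy as np

def cgs(A, passes=1):
    """Classical Gram-Schmidt; passes=2 gives CGS2 (with re-orthogonalization)."""
    m, n = A.shape
    Q = np.zeros((m, n))
    for j in range(n):
        v = A[:, j].copy()
        for _ in range(passes):
            # Project against all previously computed basis vectors at once.
            v = v - Q[:, :j] @ (Q[:, :j].T @ v)
        Q[:, j] = v / np.linalg.norm(v)
    return Q

def mgs(A):
    """Modified Gram-Schmidt: project against previous vectors one at a time."""
    m, n = A.shape
    Q = np.zeros((m, n))
    for j in range(n):
        v = A[:, j].copy()
        for i in range(j):
            v = v - (Q[:, i] @ v) * Q[:, i]
        Q[:, j] = v / np.linalg.norm(v)
    return Q

def loss_of_orthogonality(Q):
    """Spectral norm of I - Q^T Q."""
    return np.linalg.norm(np.eye(Q.shape[1]) - Q.T @ Q, 2)

def ill_conditioned_matrix(m=200, n=20, cond=1e8, seed=0):
    """Illustrative test matrix with prescribed condition number (an assumption)."""
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((m, n)))
    V, _ = np.linalg.qr(rng.standard_normal((n, n)))
    s = np.logspace(0, -np.log10(cond), n)
    return U @ np.diag(s) @ V.T

A = ill_conditioned_matrix()
losses = {
    "CGS": loss_of_orthogonality(cgs(A)),
    "CGS2": loss_of_orthogonality(cgs(A, passes=2)),
    "MGS": loss_of_orthogonality(mgs(A)),
}
for name, value in losses.items():
    print(f"{name:5s} loss of orthogonality: {value:.2e}")
```

In the classical matrix analysis, the MGS loss grows with the condition number times the unit round-off, while CGS2 stays near the unit round-off as long as the input is numerically nonsingular; the abstract reports that the same pattern seems to hold in TT format with the unit round-off replaced by the TT-rounding accuracy.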

arXiv.org e-Print Archive

Un algorithme GMRES robuste au format tensor train

Authors: Coulaud Olivier, Giraud Luc, Iannacito Martina
Publication venue: HAL CCSD
Publication date: 13/09/2022

We consider the solution of linear systems with tensor product structure using a GMRES algorithm. To cope with the computational complexity in large dimensions, both in terms of floating-point operations and memory requirements, our algorithm is based on a low-rank tensor representation, namely the Tensor Train format. In a backward error analysis framework, we show how the tensor approximation affects the accuracy of the computed solution. From the backward perspective, we investigate the situations where the (d+1)-dimensional problem to be solved results from the concatenation of a sequence of d-dimensional problems (such as parametric linear operator or parametric right-hand side problems). We provide backward error bounds that relate the accuracy of the (d+1)-dimensional computed solution to the numerical quality of the sequence of d-dimensional solutions that can be extracted from it. This makes it possible to prescribe a convergence threshold when solving the (d+1)-dimensional problem that ensures the numerical quality of the d-dimensional solutions that will be extracted from the (d+1)-dimensional computed solution once the solver has converged. These features are illustrated on a set of academic examples of varying dimensions and sizes.
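To make the Tensor Train format with a prescribed accuracy more concrete, here is a hedged sketch of the standard TT-SVD construction on a small dense tensor. It is not the solver described in the abstract and not the TT-rounding routine that operates directly on TT cores; the function names `tt_svd` and `tt_to_full` and the test tensor are assumptions made for illustration.

```python
import numpy as np

def tt_svd(T, eps):
    """Compress a dense tensor into TT cores with relative accuracy eps.

    Each of the d-1 truncated SVDs may discard at most
    delta = eps * ||T||_F / sqrt(d - 1), so the total error stays below eps * ||T||_F.
    """
    d = T.ndim
    delta = eps * np.linalg.norm(T) / np.sqrt(d - 1)
    cores, r_prev = [], 1
    C = T.copy()
    for k in range(d - 1):
        C = C.reshape(r_prev * T.shape[k], -1)
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        # tail[i] is the Frobenius norm of the singular values s[i:],
        # i.e. the error made when only the first i values are kept.
        tail = np.sqrt(np.cumsum(s[::-1] ** 2))[::-1]
        r = max(1, int(np.sum(tail > delta)))
        cores.append(U[:, :r].reshape(r_prev, T.shape[k], r))
        C = s[:r, None] * Vt[:r]
        r_prev = r
    cores.append(C.reshape(r_prev, T.shape[-1], 1))
    return cores

def tt_to_full(cores):
    """Contract TT cores back into a dense tensor (only feasible for small sizes)."""
    full = cores[0]
    for G in cores[1:]:
        full = np.tensordot(full, G, axes=1)
    return full.reshape([G.shape[1] for G in cores])

# Illustrative 4-dimensional tensor with entries 1 / (1 + i + j + k + l).
idx = np.indices((6, 6, 6, 6)).sum(axis=0)
T = 1.0 / (1.0 + idx)

eps = 1e-6
cores = tt_svd(T, eps)
rel_err = np.linalg.norm(tt_to_full(cores) - T) / np.linalg.norm(T)
print("TT ranks:", [G.shape[2] for G in cores[:-1]])
print("relative error:", rel_err)
```

The prescribed accuracy `eps` plays the role that the abstract assigns to the tensor approximation: it is the quantity whose effect on the computed solution the backward error analysis tracks.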

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Extension de l'analyse des correspondances à des donnéees multi-dimensionnelles : un point de vue géométrique

Authors: Coulaud Olivier, Franc Alain, Iannacito Martina
Publication venue: HAL CCSD
Publication date: 07/11/2021

This paper presents an extension of Correspondence Analysis (CA) to tensors through the High Order Singular Value Decomposition (HOSVD) from a geometric viewpoint. Correspondence analysis is a well-known tool, developed from principal component analysis, for studying contingency tables. Different algebraic extensions of CA to multi-way tables have been proposed over the years, but they neglect its geometric meaning. Relying on the Tucker model and the HOSVD, we propose a direct way to associate a point cloud with each tensor mode. We prove that the point clouds are related to each other. Such a relation is classical in CA, where it justifies the simultaneous projection of the row and column profiles of a contingency table (hence the name correspondence). Specifically, using the CA metrics, we show that the barycentric relation still holds in the tensor framework. Finally, two data sets are used to underline the advantages and drawbacks of our strategy with respect to the classical matrix approaches.

INRIA a CCSD electronic archive server

High order singular value decomposition per la stima della biodiversità vegetale

Authors: Bernardi Alessandra, Iannacito Martina, Rocchini Duccio
Publication venue: HAL CCSD
Publication date: 29/11/2019

We propose a new method to estimate plant biodiversity with the Rényi and Rao indexes through the so-called High Order Singular Value Decomposition (HOSVD) of tensors. Starting from NASA multispectral images, we evaluate biodiversity and compare the original biodiversity estimates with those obtained via HOSVD compression methods for big data. Our strategy turns out to be extremely powerful in terms of storage memory and precision of the outcome. The results are promising enough to support the efficiency of our method in the ecological framework.
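The HOSVD compression mentioned in this abstract can be sketched as a truncated Tucker decomposition. The code below is a minimal NumPy illustration under stated assumptions: the helper names, the synthetic tensor, and the chosen multilinear ranks are hypothetical, and no biodiversity index is computed here.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: rows indexed by `mode`, columns by all other modes."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_product(T, M, mode):
    """Multiply tensor T along `mode` by matrix M of shape (new_dim, T.shape[mode])."""
    out = np.tensordot(M, T, axes=(1, mode))  # contracted axis is replaced, placed first
    return np.moveaxis(out, 0, mode)

def hosvd(T, ranks):
    """Truncated HOSVD: one factor matrix per mode plus a core tensor."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])
    core = T
    for mode, U in enumerate(factors):
        core = mode_product(core, U.T, mode)
    return core, factors

def reconstruct(core, factors):
    """Expand a Tucker representation back into a dense tensor."""
    T = core
    for mode, U in enumerate(factors):
        T = mode_product(T, U, mode)
    return T

# Synthetic tensor with exact multilinear rank (2, 3, 2), an illustrative assumption.
rng = np.random.default_rng(0)
true_core = rng.standard_normal((2, 3, 2))
true_factors = [rng.standard_normal((n, r)) for n, r in zip((6, 7, 5), (2, 3, 2))]
T = reconstruct(true_core, true_factors)

core, factors = hosvd(T, ranks=(2, 3, 2))
rel_err = np.linalg.norm(reconstruct(core, factors) - T) / np.linalg.norm(T)
stored = core.size + sum(U.size for U in factors)
print(f"relative error {rel_err:.2e}, stored entries {stored} instead of {T.size}")
```

For real multispectral data the multilinear ranks would be chosen from the decay of the singular values of each unfolding, trading storage against the precision of the downstream estimate.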

INRIA a CCSD electronic archive server

Les variantes rétro-stables de GMRES en précision variable

Authors: Agullo Emmanuel, Coulaud Olivier, Giraud Luc, Iannacito Martina, Marait Gilles, Schenkels Nick
Publication venue: HAL CCSD
Publication date: 01/09/2022

In the context where the representation of the data is decoupled from the arithmetic used to process them, we investigate the backward stability of two backward-stable implementations of the GMRES method, namely the so-called Modified Gram-Schmidt (MGS) and Householder variants. Considering that data may be compressed to alleviate the memory footprint, we are interested in the situation where the leading part of the rounding error is related to the data representation. When the data representation of vectors introduces componentwise perturbations, we show that the existing backward stability analyses of MGS-GMRES and Householder-GMRES still apply. We illustrate this backward stability property in a practical context where an agnostic lossy compressor is employed and enables the reduction of the memory requirement to store the orthonormal Arnoldi basis or the Householder reflectors. Although the technical arguments of the theoretical backward stability proofs do not readily apply to the situation where only the normwise relative perturbations of the vector storage can be controlled, we show experimentally that the backward stability is maintained; that is, the attainable normwise backward error is of the same order as the normwise perturbations induced by the data storage. We illustrate it with numerical experiments in two different practical contexts. The first corresponds to the use of an agnostic compressor where vector compression is controlled normwise. The second arises in the solution of tensor linear systems, where low-rank tensor approximations based on the Tensor-Train format are considered to tackle the curse of dimensionality.
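The statement that the attainable normwise backward error is of the same order as the storage perturbation can be illustrated on a small dense system. This sketch is an assumption-laden toy: it uses the standard normwise backward error ||b - Ax|| / (||A|| ||x|| + ||b||), a direct solver instead of GMRES, and a hypothetical `lossy_store` function that adds random noise of relative norm `delta` in place of a real compressor.

```python
import numpy as np

def normwise_backward_error(A, x, b):
    """Standard normwise backward error ||b - Ax|| / (||A|| ||x|| + ||b||), 2-norms."""
    r = b - A @ x
    return np.linalg.norm(r) / (np.linalg.norm(A, 2) * np.linalg.norm(x) + np.linalg.norm(b))

def lossy_store(v, delta, rng):
    """Hypothetical stand-in for a lossy compressor.

    Returns v plus noise whose 2-norm is exactly delta * ||v||, i.e. a
    normwise relative perturbation of size delta.
    """
    noise = rng.standard_normal(v.shape)
    return v + delta * np.linalg.norm(v) * noise / np.linalg.norm(noise)

rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n)) + n * np.eye(n)  # well-conditioned test matrix
b = rng.standard_normal(n)

x = np.linalg.solve(A, b)          # solution in working precision
delta = 1e-6                       # storage accuracy, far above the unit round-off
x_stored = lossy_store(x, delta, rng)

eta_exact = normwise_backward_error(A, x, b)
eta_stored = normwise_backward_error(A, x_stored, b)
print(f"backward error, full precision storage: {eta_exact:.2e}")
print(f"backward error, lossy storage (delta={delta:.0e}): {eta_stored:.2e}")
```

Because the residual of the stored solution is A times the storage error, its backward error is bounded by roughly `delta`, which is the behaviour the abstract reports experimentally for the compressed GMRES variants.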

INRIA a CCSD electronic archive server

From zero to infinity: Minimum to maximum diversity of the planet by spatio-parametric Rao's quadratic entropy

Authors: Bacaro Giovanni, Da Re Daniele, Feoli Enrico, Foody Giles M, Furrer Reinhard, Harrigan Ryan J, Iannacito Martina, Kleijn David, Lenoir Jonathan, Lin Meixi, Malavasi Marco, Marcantonio Matteo, Marchetto Elisa, Meyer Rachel S, Moudry Vítězslav, Payne Davnah, Ricotta Carlo, Rocchini Duccio, Schneider Fabian D, Thornhill Andrew H, Thouverai Elisa, Vicario Saverio, Wayne Robert K, Šímová Petra
Publication venue: Wiley-Blackwell Publishing Ltd
Publication date: 17/07/2023

Aim: The majority of work done to gather information on the Earth's biodiversity has been carried out using in-situ data, with known issues related to epistemology (e.g., species determination and taxonomy), spatial uncertainty, and logistics (time and costs), among others. An alternative way to gather information about spatial ecosystem variability is the use of satellite remote sensing, a powerful tool for attaining rapid and standardized information. Several metrics used to calculate remotely sensed diversity of ecosystems are based on Shannon’s information theory, namely on the differences in relative abundance of pixel reflectances in a certain area. Additional metrics such as Rao’s quadratic entropy allow the use of spectral distance besides abundance, but they are point descriptors of diversity; that is, they can account only for a part of the whole diversity continuum. The aim of this paper is thus to generalize Rao’s quadratic entropy by proposing its parameterization for the first time. Innovation: The parametric Rao’s quadratic entropy, coded in R, (a) allows the representation of the whole continuum of potential diversity indices in one formula, and (b) starting from Rao’s quadratic entropy, allows the explicit use of distances among pixel reflectance values, together with relative abundances. Main conclusions: The proposed unifying measure is an integration between abundance- and distance-based algorithms to map the continuum of diversity given a satellite image at any spatial scale. Being part of the rasterdiv R package, the proposed method is expected to ensure high robustness and reproducibility.
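As a rough illustration of combining relative abundances with distances, the sketch below computes the classical Rao's quadratic entropy, Q = sum over i, j of p_i p_j d_ij, for a few pixel reflectance values. The abstract does not state the parametric formula, so the `parametric_rao` function is an assumption: it uses a weighted generalized mean of the distances with exponent `alpha`, which reduces to classical Rao's Q at alpha = 1 and to the maximum distance as alpha goes to infinity. It is written in Python and is not the rasterdiv implementation.

```python
import numpy as np

def rao_q(p, D):
    """Classical Rao's quadratic entropy: sum_ij p_i p_j d_ij."""
    return float(p @ D @ p)

def parametric_rao(p, D, alpha):
    """Assumed parameterization: weighted generalized mean of distances.

    alpha = 1 gives classical Rao's Q; alpha = inf gives the largest
    distance between values that are actually present.
    """
    if alpha <= 0:
        raise ValueError("this sketch only covers alpha > 0")
    w = np.outer(p, p)  # pair weights, summing to 1
    if np.isinf(alpha):
        return float(D[w > 0].max())
    return float(np.sum(w * D ** alpha) ** (1.0 / alpha))

# Reflectance values of the pixels in a small moving window (illustrative numbers).
window = np.array([0.1, 0.1, 0.4, 0.9])
values, counts = np.unique(window, return_counts=True)
p = counts / counts.sum()                        # relative abundances
D = np.abs(values[:, None] - values[None, :])    # pairwise spectral distances

q_classic = rao_q(p, D)
q_by_alpha = {a: parametric_rao(p, D, a) for a in (1, 2, 5, 50, np.inf)}
print("classical Rao's Q:", q_classic)
for a, q in q_by_alpha.items():
    print(f"alpha = {a}: {q:.4f}")
```

Sweeping `alpha` produces a curve of index values rather than a single point descriptor, which is the idea of a diversity continuum described in the abstract.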

Research UNE

Algèbre linéaire numérique et analyse de données en grande dimensions utilisant le format tenseur

Author: Iannacito Martina
Publication venue: HAL CCSD
Publication date: 09/12/2022

This work aims to establish which theoretical properties of classical linear algebra techniques, developed in two different contexts (numerical linear algebra and data analysis), are preserved and which are lost once they are extended to tensors through tensor compression algorithms. Moreover, this manuscript aims to highlight the benefits and the flaws of a tensor approach compared to its classical matrix counterpart in the two considered frameworks, paying particular attention to the computational aspects. In the numerical linear algebra part, we study experimentally the rounding error effects for an iterative solver and several orthogonalization kernels when they are extended to the tensor framework through the Tensor Train (TT) formalism. In all the considered algorithms, we introduce additional rounding steps through the TT-rounding algorithm to face memory constraints, which are always crucial when dealing with tensors. Our experiments suggest that for these algorithms the classical bounds based on rounding error analysis hold, with the unit round-off of the finite precision arithmetic replaced by the precision of the TT-rounding algorithm. The considered iterative solver is the Generalised Minimal RESidual (GMRES) method. We compare our version of TT-GMRES with the previous realization, showing numerically its greater robustness. Moreover, we address the problem of solving many linear systems in TT-format simultaneously through TT-GMRES, and we establish bounds that guarantee the numerical quality of the individual extracted solutions. The classical orthogonalization schemes generalized to tensors are CGS, CGS2, MGS, MGS2, Householder, and Gram. To complete their study, we investigate how they affect the performance of the subspace iteration eigensolver extended to tensors through the TT-format. In the data analysis part, we investigate two data analysis techniques, one meant for categorical variable data and one for climate data, generalized to tensors through the Tucker format, highlighting the benefits and the flaws of this choice compared to the corresponding matrix approach. A well-known tool for visualizing and interpreting categorical two-variable tables is Correspondence Analysis (CA). We study geometrically the generalization of CA to multiway tables through the Tucker tensor decomposition technique, contributing to the understanding of MultiWay Correspondence Analysis (MWCA). The theoretical results are complemented by examples of MWCA applied to real-life datasets. In particular, we perform the MWCA on the original ecology dataset made available in the Malabar project. For climate data, we consider the Empirical Orthogonal Function (EOF) analysis. In particular, we show how to retrieve the final EOF outcome relying on the Tucker compressed format. This approach may be computationally beneficial if the data are made available directly in Tucker format. For completeness, we study numerically the effect of the data approximation through the Tucker model on the final EOF outcome.

Thèses en Ligne

INRIA a CCSD electronic archive server

Theses.fr

Numerical linear algebra and data analysis in large dimensions using tensor format

Author: Iannacito Martina
Publication date: 27/03/2023

This work aims to establish which theoretical properties of classical linear algebra techniques, developed in two different contexts (numerical linear algebra and data analysis), are preserved and which are lost once they are extended to tensors through low-rank tensor compression algorithms. Moreover, this manuscript aims to highlight the benefits and the flaws of a tensor approach compared to its classical matrix counterpart in the two considered frameworks, paying particular attention to the computational aspects. In the numerical linear algebra part, we study experimentally the rounding error effects for an iterative solver and several orthogonalization kernels when they are extended to the tensor framework through the Tensor Train (TT) formalism. In all the considered algorithms, we introduce additional rounding steps through the TT-rounding algorithm to face memory constraints, which are always crucial when dealing with tensors. Our experiments suggest that for these algorithms the classical bounds based on rounding error analysis hold, with the unit round-off of the finite precision arithmetic replaced by the precision of the TT-rounding algorithm. The considered iterative solver is the Generalised Minimal RESidual (GMRES) method. We compare our version of TT-GMRES with the previous realization, showing numerically its greater robustness. Moreover, we address the problem of solving many linear systems in TT-format simultaneously through TT-GMRES, and we establish bounds that guarantee the numerical quality of the individual extracted solutions. The classical orthogonalization schemes generalized to tensors are CGS, CGS2, MGS, MGS2, Householder, and Gram. To complete their study, we investigate how they affect the performance of the subspace iteration eigensolver extended to tensors through the TT-format. In the data analysis part, we investigate two data analysis techniques, one meant for categorical variable data and one for climate data, generalized to tensors through the Tucker format, highlighting the benefits and the flaws of this choice compared to the corresponding matrix approach. A well-known tool for visualizing and interpreting categorical two-variable tables is Correspondence Analysis (CA). We study geometrically the generalization of CA to multiway tables through the Tucker tensor decomposition technique, contributing to the understanding of MultiWay Correspondence Analysis (MWCA). The theoretical results are complemented by examples of MWCA applied to real-life datasets. In particular, we perform the MWCA on the original ecology dataset made available in the Malabar project. For climate data, we consider the Empirical Orthogonal Function (EOF) analysis. In particular, we show how to retrieve the final EOF outcome relying on the Tucker compressed format. This approach may be computationally beneficial if the data are made available directly in Tucker format. For completeness, we study numerically the effect of the data approximation through the Tucker model on the final EOF outcome.

Oskar Bordeaux
